Part 2: Intro to Federated Learning

In the last section, we learned about PointerTensors, which create the underlying infrastructure we need for privacy preserving Deep Learning. In this section, we'll see how to use these basic tools to implement our first privacy preserving deep learning algorithm: Federated Learning.

Authors:

Translators:

What is Federated Learning?

It's a simple, powerful way to train Deep Learning models. If you think about training data, it is always the result of a collection process. People (via devices) generate data by recording events in the world. Normally, that data is aggregated into a single, central location so that a machine learning model can be trained on it. Federated Learning turns this on its head!

Instead of bringing training data to the model (a central server), you bring the model to the training data (wherever it lives).

The idea is that whoever is creating the data gets to keep the only permanent copy, and thus retains control over who gets access to it. Pretty cool, eh?

Section 2.1 - A Toy Federated Learning Example

Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:

  • a toy dataset
  • a model
  • some basic training logic for training a model to fit the data.

Note: If this API is unfamiliar to you - head on over to fast.ai and take their course before continuing with this tutorial.


In [ ]:
import torch
from torch import nn
from torch import optim

In [ ]:
# A Toy Dataset
data = torch.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)
target = torch.tensor([[0],[0],[1],[1.]], requires_grad=True)

# A Toy Model
model = nn.Linear(2,1)

def train():
    # Training Logic
    opt = optim.SGD(params=model.parameters(),lr=0.1)
    for iter in range(20):

        # 1) erase previous gradients (if they exist)
        opt.zero_grad()

        # 2) make a prediction
        pred = model(data)

        # 3) calculate how much we missed
        loss = ((pred - target)**2).sum()

        # 4) figure out which weights caused us to miss
        loss.backward()

        # 5) change those weights
        opt.step()

        # 6) print our progress
        print(loss.data)

In [ ]:
train()

And there you have it! We've trained a basic model in the conventional manner. All of our data sits on our local machine and we use it to make updates to our model. Federated Learning, however, doesn't work this way. So, let's modify this example to do it the Federated Learning way!

So, here's what we'll need:

  • create a couple workers
  • get pointers to training data on each worker
  • updated training logic to do federated learning

    New Training Steps:

    • send model to correct worker
    • train on the data located there
    • get the model back and repeat with next worker

In [ ]:
import syft as sy
hook = sy.TorchHook(torch)

In [ ]:
# create a couple workers

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")

In [ ]:
# A Toy Dataset
data = torch.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)
target = torch.tensor([[0],[0],[1],[1.]], requires_grad=True)

# get pointers to training data on each worker by
# sending some training data to bob and alice
data_bob = data[0:2]
target_bob = target[0:2]

data_alice = data[2:]
target_alice = target[2:]

# Initialize A Toy Model
model = nn.Linear(2,1)

data_bob = data_bob.send(bob)
data_alice = data_alice.send(alice)
target_bob = target_bob.send(bob)
target_alice = target_alice.send(alice)
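
# note: after .send(), these variables are PointerTensors -
# the actual training data now lives on bob's and alice's (virtual) machines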

# organize pointers into a list
datasets = [(data_bob,target_bob),(data_alice,target_alice)]

opt = optim.SGD(params=model.parameters(),lr=0.1)

In [ ]:
def train():
    # Training Logic
    opt = optim.SGD(params=model.parameters(),lr=0.1)
    for iter in range(10):
        
        # NEW) iterate through each worker's dataset
        for data,target in datasets:
            
            # NEW) send model to correct worker
            model.send(data.location)

            # 1) erase previous gradients (if they exist)
            opt.zero_grad()

            # 2) make a prediction
            pred = model(data)

            # 3) calculate how much we missed
            loss = ((pred - target)**2).sum()
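            # note: model and data are both on the remote worker at this point,
            # so pred and loss are PointerTensors and the computation runs on that worker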

            # 4) figure out which weights caused us to miss
            loss.backward()

            # 5) change those weights
            opt.step()
            
            # NEW) get model (with gradients)
            model.get()

            # 6) print our progress
            print(loss.get()) # NEW) slight edit... need to call .get() on loss
    
# federated averaging comes later - see the "Shortcomings" section below

In [ ]:
train()

Well Done!

And voilà! We are now training a simple Deep Learning model using Federated Learning! We send the model to each worker, generate new gradients there, and then bring the updated model back to our local server where we update our global model. Never in this process do we see or request access to the underlying training data! We preserve the privacy of Bob and Alice!!!

Shortcomings of this Example

While this example is a nice introduction to Federated Learning, it still has some major shortcomings. Most notably, when we call model.get() and receive the updated model from Bob or Alice, we can actually learn a lot about Bob's and Alice's training data by looking at their gradients. In some cases, we can even reconstruct their training data!

So, what can be done? Well, the first strategy people employ is to average the gradients across multiple individuals before uploading them to the central server. This strategy, however, requires a more sophisticated use of PointerTensor objects, so in the next section we'll learn about more advanced pointer functionality and then upgrade this Federated Learning example. The sketch below gives a rough feel for the averaging idea.
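Here is a minimal, plain-PyTorch sketch of that averaging idea (no PySyft involved; the helper names local_update and average_models are made up purely for illustration, not part of any API). It trains a separate copy of the model on Bob's and Alice's slices of the toy data and only ever applies the averaged result to the global model. Note that this sketch performs the averaging in plain sight; the point of the upgrade in the next sections is to do it without anyone seeing the individual updates.

In [ ]:
# a rough sketch of the averaging idea in plain PyTorch (illustrative only)
import copy
import torch
from torch import nn, optim

def local_update(global_model, data, target, lr=0.1):
    # each participant trains their own private copy of the current global model
    local_model = copy.deepcopy(global_model)
    opt = optim.SGD(local_model.parameters(), lr=lr)
    opt.zero_grad()
    loss = ((local_model(data) - target) ** 2).sum()
    loss.backward()
    opt.step()
    return local_model

def average_models(models):
    # average the parameters of the locally trained models, element-wise
    avg = copy.deepcopy(models[0])
    with torch.no_grad():
        for p_avg, *ps in zip(avg.parameters(), *[m.parameters() for m in models]):
            p_avg.copy_(torch.stack([p.detach() for p in ps]).mean(dim=0))
    return avg

# same toy dataset as above, split between two participants
data = torch.tensor([[0,0],[0,1],[1,0],[1,1.]])
target = torch.tensor([[0],[0],[1],[1.]])

global_model = nn.Linear(2, 1)
for fl_round in range(10):
    bob_model = local_update(global_model, data[0:2], target[0:2])
    alice_model = local_update(global_model, data[2:], target[2:])
    # only the averaged update ever reaches the "central server"
    global_model = average_models([bob_model, alice_model])
    print(((global_model(data) - target) ** 2).sum().item())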

Congratulations!!! - Time to Join the Community!

Give yourselves a round of applause for completing this notebook tutorial! If you enjoyed it and would like to join the movement toward privacy preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways.

Star PySyft on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the tools we're building.

Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at http://slack.openmined.org

Join a Code Project!

The best way to contribute to our community is to become a code contributor! You can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top level Tickets giving an overview of what projects you can join! If you don't want to join a project, but you would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

OpenMined's Open Collective Page


In [ ]:


In [ ]: